Search | VHL Regional Portal

Investigating the genetic makeup of the major histocompatibility complex (MHC) in the United Arab Emirates population through next-generation sequencing.

Marzouka, Nour Al Dain; Alnaqbi, Halima; Al-Aamri, Amira; Tay, Guan; Alsafar, Habiba.

Sci Rep ; 14(1): 3392, 2024 Feb 09.

Article in English | MEDLINE | ID: mdl-38337023

ABSTRACT

The Human leukocyte antigen (HLA) molecules are central to immune response and have associations with the phenotypes of various diseases and induced drug toxicity. Further, the role of HLA molecules in presenting antigens significantly affects the transplantation outcome. The objective of this study was to examine the extent of the diversity of HLA alleles in the population of the United Arab Emirates (UAE) using Next-Generation Sequencing methodologies and encompassing a larger cohort of individuals. A cohort of 570 unrelated healthy citizens of the UAE volunteered to provide samples for Whole Genome Sequencing and Whole Exome Sequencing. The definition of the HLA alleles was achieved through the application of the bioinformatics tools, HLA-LA and xHLA. Subsequently, the findings from this study were compared with other local and international datasets. A broad range of HLA alleles in the UAE population, of which some were previously unreported, was identified. A comparison with other populations confirmed the current population's unique intertwined genetic heritage while highlighting similarities with populations from the Middle East region. Some disease-associated HLA alleles were detected at a frequency of > 5%, such as HLA-B*51:01, HLA-DRB1*03:01, HLA-DRB1*15:01, and HLA-DQB1*02:01. The increase in allele homozygosity, especially for HLA class I genes, was identified in samples with a higher level of genome-wide homozygosity. This highlights a possible effect of consanguinity on the HLA homozygosity. The HLA allele distribution in the UAE population showcases a unique profile, underscoring the need for tailored databases for traditional activities such as unrelated transplant matching and for newer initiatives in precision medicine based on specific populations. 
This research is part of a concerted effort to improve the knowledge base, particularly in the fields of transplant medicine and investigating disease associations as well as in understanding human migration patterns within the Arabian Peninsula and surrounding regions.

Subjects

Histocompatibility Antigens Class II, Histocompatibility Antigens Class I, Humans, United Arab Emirates, Gene Frequency, Histocompatibility Antigens Class I/genetics, Histocompatibility Antigens Class II/genetics, Major Histocompatibility Complex/genetics, High-Throughput Nucleotide Sequencing, Haplotypes, Alleles, HLA-DRB1 Chains/genetics
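
The record above reports allele frequencies (for example, alleles seen at a frequency above 5%) and per-locus homozygosity. As a minimal sketch of how such summary figures can be computed from typed genotypes, the Python below counts alleles at one locus. The genotype list is made-up toy data, and the function names are hypothetical; this is not the authors' pipeline, which relied on HLA-LA and xHLA for the typing itself.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    Each individual contributes two allele copies.
    """
    counts = Counter(allele for pair in genotypes for allele in pair)
    total_copies = 2 * len(genotypes)
    return {allele: n / total_copies for allele, n in counts.items()}


def homozygosity_rate(genotypes):
    """Fraction of individuals carrying two identical alleles at the locus."""
    return sum(1 for a, b in genotypes if a == b) / len(genotypes)


# Toy HLA-B genotypes (invented for illustration, not study data).
toy_hla_b = [
    ("B*51:01", "B*51:01"),
    ("B*51:01", "B*08:01"),
    ("B*35:01", "B*08:01"),
    ("B*35:01", "B*50:01"),
]

freqs = allele_frequencies(toy_hla_b)
# Alleles above the 5% threshold mentioned in the abstract.
common_alleles = sorted(a for a, f in freqs.items() if f > 0.05)
hom_rate = homozygosity_rate(toy_hla_b)
```

With real data, the same two functions would be applied per locus (HLA-A, HLA-B, HLA-DRB1, and so on) over the full cohort.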

Critical assessment of on-premise approaches to scalable genome analysis.

Al-Aamri, Amira; Kamarul Azman, Syafiq; Daw Elbait, Gihan; Alsafar, Habiba; Henschel, Andreas.

BMC Bioinformatics ; 24(1): 354, 2023 Sep 21.

Article in English | MEDLINE | ID: mdl-37735350

ABSTRACT

BACKGROUND: Plummeting DNA sequencing cost in recent years has enabled genome sequencing projects to scale up by several orders of magnitude, which is transforming genomics into a highly data-intensive field of research. This development provides the much needed statistical power required for genotype-phenotype predictions in complex diseases. METHODS: In order to efficiently leverage the wealth of information, we here assessed several genomic data science tools. The rationale to focus on on-premise installations is to cope with situations where data confidentiality and compliance regulations etc. rule out cloud based solutions. We established a comprehensive qualitative and quantitative comparison between BCFtools, SnpSift, Hail, GEMINI, and OpenCGA. The tools were compared in terms of data storage technology, query speed, scalability, annotation, data manipulation, visualization, data output representation, and availability. RESULTS: Tools that leverage sophisticated data structures are noted as the most suitable for large-scale projects in varying degrees of scalability in comparison to flat-file manipulation (e.g., BCFtools, and SnpSift). Remarkably, for small to mid-size projects, even lightweight relational database. CONCLUSION: The assessment criteria provide insights into the typical questions posed in scalable genomics and serve as guidance for the development of scalable computational infrastructure in genomics.

Subjects

Data Science, Genomics, Chromosome Mapping, Factual Databases, DNA Sequence Analysis

Inferring Gene Regulatory Networks from RNA-seq Data Using Kernel Classification.

Al-Aamri, Amira; Kudlicki, Andrzej S; Maalouf, Maher; Taha, Kamal; Homouz, Dirar.

Biology (Basel) ; 12(4), 2023 Mar 29.

Article in English | MEDLINE | ID: mdl-37106719

ABSTRACT

Gene expression profiling is one of the most recognized techniques for inferring gene regulators and their potential targets in gene regulatory networks (GRN). The purpose of this study is to build a regulatory network for the budding yeast Saccharomyces cerevisiae genome by incorporating the use of RNA-seq and microarray data represented by a wide range of experimental conditions. We introduce a pipeline for data analysis, data preparation, and training models. Several kernel classification models, including one-class, two-class, and rare event classification methods, are used to categorize genes. We test the impact of the normalization techniques on the overall performance of RNA-seq. Our findings provide new insights into the interactions between genes in the yeast regulatory network. The conclusions of our study have significant importance since they highlight the effectiveness of classification and its contribution towards enhancing the present comprehension of the yeast regulatory network. When assessed, our pipeline demonstrates strong performance across different statistical metrics, such as a 99% recall rate and a 98% AUC score.

Forecasting the SARS COVID-19 pandemic and critical care resources threshold in the Gulf Cooperation Council (GCC) countries: population analysis of aggregate data.

Al-Aamri, Amira K; Al-Harrasi, Ayaman A; AAl-Abdulsalam, Abdurahman K; Al-Maniri, Abdullah A; Padmadas, Sabu S.

BMJ Open ; 11(5): e044102, 2021 May 11.

Article in English | MEDLINE | ID: mdl-33980523

ABSTRACT

OBJECTIVE: To generate cross-national forecasts of COVID-19 trajectories and quantify the associated impact on essential critical care resources for disease management in Gulf Cooperation Council (GCC) countries. DESIGN: Population-level aggregate analysis. SETTING: Bahrain, Kuwait, Oman, Qatar, United Arab Emirates (UAE) and Saudi Arabia. METHODS: We applied an extended time-dependent SEICRD compartmental model to predict the flow of people between six states, susceptible-exposed-infected-critical-recovery-death, accounting for community mitigation strategies and the latent period between exposure and infected and contagious states. Then, we used the WHO Adaptt Surge Planning Tool to predict intensive care unit (ICU) and human resources capacity based on predicted daily active and cumulative infections from the SEICRD model. MAIN OUTCOME MEASURES: Predicted COVID-19 infections, deaths, and ICU and human resources capacity for disease management. RESULTS: COVID-19 infections vary daily from 498 per million in Bahrain to over 300 per million in UAE and Qatar, to 9 per million in Saudi Arabia. The cumulative number of deaths varies from 302 per million in Oman to 89 in Qatar. UAE attained its first peak as early as 21 April 2020, whereas Oman had its peak on 29 August 2020. In absolute terms, Saudi Arabia is predicted to have the highest COVID-19 mortality burden, followed by UAE and Oman. The predicted maximum number of COVID-19-infected patients in need of oxygen therapy during the peak of emergency admissions varies between 690 in Bahrain, 1440 in Oman and over 10 000 in Saudi Arabia. CONCLUSION: Although most GCC countries have managed to flatten the epidemiological curve by August 2020, trends since November 2020 show potential increase in new infections. The pandemic is predicted to recede by August 2021, provided the existing infection control measures continue effectively and consistently across all countries. 
Current health infrastructure, including the provision of ICUs and nursing staff, seems adequate, but health systems should keep ICUs ready to manage critically ill patients.

Subjects

COVID-19, Severe Acute Respiratory Syndrome, Bahrain/epidemiology, Critical Care, Humans, Kuwait/epidemiology, Oman/epidemiology, Pandemics, Qatar, SARS-CoV-2, Saudi Arabia/epidemiology, United Arab Emirates/epidemiology
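
The abstract above describes a time-dependent SEICRD compartmental model with six states (susceptible, exposed, infected, critical, recovered, dead). The sketch below shows one way such a model can be stepped forward in discrete daily increments. The flow structure between compartments, every parameter value, and the step-shaped transmission rate standing in for mitigation are illustrative assumptions; none of them are taken from the paper.

```python
def simulate_seicrd(n_days, population, beta_of_t, sigma, kappa,
                    gamma_i, gamma_c, mu, initial_exposed=10.0):
    """Euler-step a six-compartment S-E-I-C-R-D model, one step per day.

    beta_of_t : function day -> transmission rate (time dependence models mitigation)
    sigma     : rate of leaving the latent (exposed) state, 1 / latent period
    kappa     : rate at which infected cases become critical
    gamma_i   : recovery rate of non-critical infected cases
    gamma_c   : recovery rate of critical cases
    mu        : death rate of critical cases
    All rates are hypothetical placeholders.
    """
    s, e = population - initial_exposed, initial_exposed
    i = c = r = d = 0.0
    history = [(s, e, i, c, r, d)]
    for day in range(n_days):
        new_exposed = beta_of_t(day) * s * i / population
        new_infected = sigma * e
        new_critical = kappa * i
        recovered_from_i = gamma_i * i
        recovered_from_c = gamma_c * c
        deaths = mu * c

        s -= new_exposed
        e += new_exposed - new_infected
        i += new_infected - new_critical - recovered_from_i
        c += new_critical - recovered_from_c - deaths
        r += recovered_from_i + recovered_from_c
        d += deaths
        history.append((s, e, i, c, r, d))
    return history


# Assumed scenario: transmission drops after day 30 as mitigation takes effect.
history = simulate_seicrd(
    n_days=200,
    population=1_000_000,
    beta_of_t=lambda day: 0.4 if day < 30 else 0.15,
    sigma=0.2, kappa=0.02, gamma_i=0.1, gamma_c=0.05, mu=0.02,
)
peak_critical_day = max(range(len(history)), key=lambda t: history[t][3])
```

The daily critical-compartment series (`history[t][3]`) is the kind of output that a capacity planning tool would take as input for ICU demand estimates.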

Inferring Causation in Yeast Gene Association Networks With Kernel Logistic Regression.

Al-Aamri, Amira; Taha, Kamal; Maalouf, Maher; Kudlicki, Andrzej; Homouz, Dirar.

Evol Bioinform Online ; 16: 1176934320920310, 2020.

Article in English | MEDLINE | ID: mdl-35173404

ABSTRACT

Computational prediction of gene-gene associations is one of the productive directions in the study of bioinformatics. Many tools are developed to infer the relation between genes using different biological data sources. The association of a pair of genes deduced from the analysis of biological data becomes meaningful when it reflects the directionality and the type of reaction between genes. In this work, we follow another method to construct a causal gene co-expression network while identifying transcription factors in each pair of genes using microarray expression data. We adopt a machine learning technique based on a logistic regression model to tackle the sparsity of the network and to improve the quality of the prediction accuracy. The proposed system classifies each pair of genes into either connected or nonconnected class using the data of the correlation between these genes in the whole Saccharomyces cerevisiae genome. The accuracy of the classification model in predicting related genes was evaluated using several data sets for the yeast regulatory network. Our system achieves high performance in terms of several statistical measures.

Predicting protein functions by applying predicate logic to biomedical literature.

Taha, Kamal; Iraqi, Youssef; Al Aamri, Amira.

BMC Bioinformatics ; 20(1): 71, 2019 Feb 08.

Article in English | MEDLINE | ID: mdl-30736739

ABSTRACT

BACKGROUND: A large number of computational methods have been proposed for predicting protein functions. The underlying techniques adopted by most of these methods revolve around predicting the functions of an unannotated protein p from already annotated proteins that have similar characteristics as p. Recent Information Extraction methods take advantage of the huge growth of biomedical literature to predict protein functions. They extract biological molecule terms that directly describe protein functions from biomedical texts. However, they consider only explicitly mentioned terms that co-occur with proteins in texts. We observe that some important biological molecule terms pertaining functional categories may implicitly co-occur with proteins in texts. Therefore, the methods that rely solely on explicitly mentioned terms in texts may miss vital functional information implicitly mentioned in the texts. RESULTS: To overcome the limitations of methods that rely solely on explicitly mentioned terms in texts to predict protein functions, we propose in this paper an Information Extraction system called PL-PPF. The proposed system employs techniques for predicting the functions of proteins based on their co-occurrences with explicitly and implicitly mentioned biological molecule terms that pertain functional categories in biomedical literature. That is, PL-PPF employs a combination of statistical-based explicit term extraction techniques and logic-based implicit term extraction techniques. The statistical component of PL-PPF predicts some of the functions of a protein by extracting the explicitly mentioned functional terms that directly describe the functions of the protein from the biomedical texts associated with the protein. The logic-based component of PL-PPF predicts additional functions of the protein by inferring the functional terms that co-occur implicitly with the protein in the biomedical texts associated with it. 
First, the system employs its statistical-based component to extract the explicitly mentioned functional terms. Then, it employs its logic-based component to infer additional functions of the protein. Our hypothesis is that important biological molecule terms pertaining functional categories of proteins are likely to co-occur implicitly with the proteins in biomedical texts. We evaluated PL-PPF experimentally and compared it with five systems. Results revealed better prediction performance. CONCLUSIONS: The experimental results showed that PL-PPF outperformed the other five systems. This is an indication of the effectiveness and practical viability of PL-PPF's combination of explicit and implicit techniques. We also evaluated two versions of PL-PPF: one adopting the complete techniques (i.e., adopting both the implicit and explicit techniques) and the other adopting only the explicit terms co-occurrence extraction techniques (i.e., without the inference rules for predicate logic). The experimental results showed that the complete version outperformed significantly the other version. This is attributed to the effectiveness of the rules of predicate logic to infer functional terms that co-occur implicitly with proteins in biomedical texts. A demo application of PL-PPF can be accessed through the following link: http://ecesrvr.kustar.ac.ae:8080/plppf/.

Subjects

Logic, Proteins/metabolism, Publications, Genetic Databases, Gene Ontology, Fungal Genome, Information Storage and Retrieval, Molecular Sequence Annotation, Reproducibility of Results, Saccharomyces cerevisiae/genetics

Analyzing a co-occurrence gene-interaction network to identify disease-gene association.

Al-Aamri, Amira; Taha, Kamal; Al-Hammadi, Yousof; Maalouf, Maher; Homouz, Dirar.

BMC Bioinformatics ; 20(1): 70, 2019 Feb 08.

Article in English | MEDLINE | ID: mdl-30736752

ABSTRACT

BACKGROUND: Understanding the genetic networks and their role in chronic diseases (e.g., cancer) is one of the important objectives of biological researchers. In this work, we present a text mining system that constructs a gene-gene-interaction network for the entire human genome and then performs network analysis to identify disease-related genes. We recognize the interacting genes based on their co-occurrence frequency within the biomedical literature and by employing linear and non-linear rare-event classification models. We analyze the constructed network of genes by using different network centrality measures to decide on the importance of each gene. Specifically, we apply betweenness, closeness, eigenvector, and degree centrality metrics to rank the central genes of the network and to identify possible cancer-related genes. RESULTS: We evaluated the top 15 ranked genes for different cancer types (i.e., prostate, breast, and lung cancer). The average precisions for identifying breast, prostate, and lung cancer genes vary between 80% and 100%. In a prostate case study, the system predicted an average of 80% prostate-related genes. CONCLUSIONS: The results show that our system has the potential for improving the prediction accuracy of identifying gene-gene interaction and disease-gene associations. We also conduct a prostate cancer case study by using the threshold property in logistic regression, and we compare our approach with some of the state-of-the-art methods.

Subjects

Genetic Epistasis, Gene Regulatory Networks, Genetic Predisposition to Disease, Humans, Logistic Models, Male, Prostatic Neoplasms/genetics, ROC Curve
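
The abstract above ranks genes by network centrality. As a small illustration of two of the four named measures, the sketch below computes normalized degree centrality and closeness centrality on an undirected toy graph using only the standard library. The node labels `G1` to `G5` and the edge list are hypothetical placeholders, not genes or interactions from the study, and the closeness formula assumes a connected graph.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def degree_centrality(adjacency):
    """Degree divided by the maximum possible degree (n - 1)."""
    n = len(adjacency)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


def closeness_centrality(adjacency):
    """(n - 1) divided by the sum of shortest-path lengths to all other nodes."""
    n = len(adjacency)
    result = {}
    for source in adjacency:
        # Breadth-first search gives shortest path lengths in an unweighted graph.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adjacency[node]:
                if nbr not in distance:
                    distance[nbr] = distance[node] + 1
                    queue.append(nbr)
        result[source] = (n - 1) / sum(distance.values())
    return result


# Hypothetical co-occurrence network: G1 is a hub, G5 hangs off G4.
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G4", "G5")]
adjacency = build_adjacency(toy_edges)
degree = degree_centrality(adjacency)
closeness = closeness_centrality(adjacency)
ranked_by_closeness = sorted(closeness, key=closeness.get, reverse=True)
```

Betweenness and eigenvector centrality follow the same pattern of scoring every node and sorting, but need more involved algorithms than fit in a short sketch.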

Constructing Genetic Networks using Biomedical Literature and Rare Event Classification.

Al-Aamri, Amira; Taha, Kamal; Al-Hammadi, Yousof; Maalouf, Maher; Homouz, Dirar.

Sci Rep ; 7(1): 15784, 2017 Nov 17.

Article in English | MEDLINE | ID: mdl-29150626

ABSTRACT

Text mining has become an important tool in bioinformatics research with the massive growth in the biomedical literature over the past decade. Mining the biomedical literature has resulted in an incredible number of computational algorithms that assist many bioinformatics researchers. In this paper, we present a text mining system called Gene Interaction Rare Event Miner (GIREM) that constructs gene-gene-interaction networks for human genome using information extracted from biomedical literature. GIREM identifies functionally related genes based on their co-occurrences in the abstracts of biomedical literature. For a given gene g, GIREM first extracts the set of genes found within the abstracts of biomedical literature associated with g. GIREM aims at enhancing biological text mining approaches by identifying the semantic relationship between each co-occurrence of a pair of genes in abstracts using the syntactic structures of sentences and linguistics theories. It uses a supervised learning algorithm, weighted logistic regression to label pairs of genes to related or un-related classes, and to reflect the population proportion using smaller samples. We evaluated GIREM by comparing it experimentally with other well-known approaches and a protein-protein interactions database. Results showed marked improvement.

Subjects

Data Mining, Gene Regulatory Networks, Publications, Genes, ROC Curve
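
The abstract above mentions weighted logistic regression used to reflect the population proportion when training on smaller samples. One common way to do this, assumed here for illustration and not confirmed as GIREM's exact formulation, is to weight each class by the ratio of its assumed population share to its sample share and then maximize the weighted log-likelihood. The single feature, the toy data, and the assumed population rate of 10% are all invented.

```python
import math


def rare_event_weights(labels, population_rate):
    """Per-example weights: population share / sample share for each class."""
    sample_rate = sum(labels) / len(labels)
    w_pos = population_rate / sample_rate
    w_neg = (1 - population_rate) / (1 - sample_rate)
    return [w_pos if y == 1 else w_neg for y in labels]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_weighted_logistic(xs, labels, weights, learning_rate=1.0, epochs=5000):
    """Gradient ascent on the weighted log-likelihood for one feature plus intercept."""
    intercept, slope = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_intercept = grad_slope = 0.0
        for x, y, w in zip(xs, labels, weights):
            residual = w * (y - sigmoid(intercept + slope * x))
            grad_intercept += residual
            grad_slope += residual * x
        intercept += learning_rate * grad_intercept / n
        slope += learning_rate * grad_slope / n
    return intercept, slope


# Toy sample balanced 50/50 between related (1) and unrelated (0) gene pairs;
# x is a hypothetical co-occurrence score.
xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

weights = rare_event_weights(labels, population_rate=0.1)
weighted_model = fit_weighted_logistic(xs, labels, weights)
unweighted_model = fit_weighted_logistic(xs, labels, [1.0] * len(xs))

p_weighted = sigmoid(weighted_model[0] + weighted_model[1] * 0.5)
p_unweighted = sigmoid(unweighted_model[0] + unweighted_model[1] * 0.5)
```

Because related pairs are assumed to be rare in the population, the weighted fit assigns a lower probability to a borderline score than the unweighted fit trained on the balanced sample.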

Disentangling age-gender interactions associated with risks of fatal and non-fatal road traffic injuries in the Sultanate of Oman.

Al-Aamri, Amira K; Padmadas, Sabu S; Zhang, Li-Chun; Al-Maniri, Abdullah A.

BMJ Glob Health ; 2(3): e000394, 2017.

Article in English | MEDLINE | ID: mdl-29018585

ABSTRACT

OBJECTIVE: Road traffic injuries (RTIs) are the leading cause of disability-adjusted life years lost in Oman, Saudi Arabia and United Arab Emirates. Injury prevention strategies often overlook the interaction of individual and behavioural risk factors in assessing the severity of RTI outcomes. We conducted a systematic investigation of the underlying interactive effects of age and gender on the severity of fatal and non-fatal RTI outcomes in the Sultanate of Oman. METHODS: We used the Royal Oman Police national database of road traffic crashes for the period 2010-2014. Our study was based on 35 785 registered incidents: of these, 10.2% fatal injuries, 6.2% serious, 27.3% moderate, 37.3% mild injuries and 19% only vehicle damage but no human injuries. We applied a generalised ordered logit regression to estimate the effect of age and gender on RTI severity, controlling for risk behaviours, personal characteristics, vehicle, road, traffic, environment conditions and geographical location. RESULTS: The most dominant group at risk of all types of RTIs was young male drivers. The probability of severe incapacitating injuries was the highest for drivers aged 25-29 (26.6%) years, whereas the probability of fatal injuries was the highest for those aged 20-24 (26.9%) years. Analysis of three-way interactions of age, gender and causes of crash show that overspeeding was the primary cause of different types of RTIs. In particular, the probability of fatal injuries among male drivers attributed to overspeeding ranged from 3%-6% for those aged 35 years and above to 13.4% and 17.7% for those aged 25-29 years and 20-24 years, respectively. CONCLUSIONS: The high burden of severe and fatal RTIs in Oman was primarily attributed to overspeed driving behaviour of young male drivers in the 20-29 years age range. Our findings highlight the critical need for designing early gender-sensitive road safety interventions targeting young male and female drivers.
